The eSNV-detect: a computational system to identify expressed single nucleotide variants from transcriptome sequencing data

Authors

  • Xiaojia Tang
  • Saurabh Baheti
  • Khader Shameer
  • Kevin J. Thompson
  • Quin Wills
  • Nifang Niu
  • Ilona N. Holcomb
  • Stephane C. Boutet
  • Ramesh Ramakrishnan
  • Jennifer M. Kachergus
  • Jean-Pierre A. Kocher
  • Richard M. Weinshilboum
  • Liewei Wang
  • E. Aubrey Thompson
  • Krishna R. Kalari

Abstract

Rapid development of next-generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. A number of software pipelines are available for calling single nucleotide variants from genomic DNA, but there are no comprehensive pipelines to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed eSNV-Detect, a novel computational system that uses data from multiple aligners to call variants from RNA-Seq, even at low read depths, and to rank them. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq data from a lymphoblastoid cell line, achieving 99.7% precision and 91.0% sensitivity for the expressed SNPs against the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole-exome coding data showed 90.6-96.8% precision and 91.6-95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MDA-MB-231 breast cancer cell line delineated variant heterogeneity among the single cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue, validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/.
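
The precision and sensitivity figures quoted in the abstract are concordance measures between a set of called variants and a reference ("truth") set from another platform. The sketch below is not taken from the eSNV-Detect source code. It is a minimal illustration of how such figures are conventionally computed, assuming that variants are represented as hypothetical (chromosome, position, reference allele, alternate allele) tuples and that the truth set has already been restricted to expressed sites.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical variant key for illustration: (chrom, pos, ref, alt).
Variant = Tuple[str, int, str, str]


class Concordance(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    sensitivity: float


def compare_calls(called: Set[Variant], truth: Set[Variant]) -> Concordance:
    """Compare called variants against a truth set from another platform."""
    tp = len(called & truth)   # called and confirmed by the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # present in the truth set but not called
    # precision = TP / (TP + FP); sensitivity = TP / (TP + FN)
    precision = tp / (tp + fp) if called else float("nan")
    sensitivity = tp / (tp + fn) if truth else float("nan")
    return Concordance(tp, fp, fn, precision, sensitivity)


if __name__ == "__main__":
    # Toy data only; these are not variants from the study.
    called = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 75, "G", "A")}
    truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
             ("chr3", 10, "T", "C"), ("chr5", 900, "A", "C")}
    result = compare_calls(called, truth)
    print(f"precision={result.precision:.3f} sensitivity={result.sensitivity:.3f}")
```

By the same arithmetic, the Sanger result of 29 validated candidates out of 31 corresponds to a validation rate of about 93.5%.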

Similar resources

Transcriptome Sequencing of Guilan Native Cow in Comparison with bosTau4 Reference Genome

RNA-sequencing is a new method for characterizing the transcriptome of organisms. Based on identity and relatedness, there are large genetic variations among different cattle breeds. The goal of the current study was to sequence the transcriptome of the Guilan native cow and compare it with the available reference genome using RNA-sequencing. Blood samples were collected from 14 Guilan native cows and...

Mitochondrial Genetic Variation in Iranian Infertile Men with Varicocele

Objective: Several recent studies have shown that mitochondrial DNA mutations lead to major disabilities and premature death in carriers. More than 150 mutations in human mitochondrial DNA (mtDNA) genes have been associated with a wide spectrum of disorders. Varicocele, one of the causes of infertility in men wherein abnormal inflexion and distension of veins of the pampiniform plexus is observe...

Joint Variant and De Novo Mutation Identification on Pedigrees from High-Throughput Sequencing Data

The analysis of whole-genome or exome sequencing data from trios and pedigrees has been successfully applied to the identification of disease-causing mutations. However, most methods used to identify and genotype genetic variants from next-generation sequencing data ignore the relationships between samples, resulting in significant Mendelian errors, false positives and negatives. Here we presen...

Gene Expression, Single Nucleotide Variant and Fusion Transcript Discovery in Archival Material from Breast Tumors

Advantages of RNA-Seq over array-based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs, nine pairs of matched fresh-frozen, and FFPE RNA isolated from breast tumor ...

PVAAS: identify variants associated with aberrant splicing from RNA-seq

MOTIVATION: RNA-seq has been widely used to study the transcriptome. Compared to microarrays, sequencing-based RNA-seq is able to identify splicing variants and single nucleotide variants simultaneously in one experiment. This provides a unique opportunity to detect variants that are associated with aberrant splicing. Despite the popularity of RNA-seq, no bioinformatics tool has been developed to leve...

Volume 42

Publication date: 2014